Skip to content

[MLIR][DLTI] Make DLTI queries visit all ancestors/respect nested scopes #115043

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

rolfmorel
Copy link
Contributor

@rolfmorel rolfmorel commented Nov 5, 2024

Change the behaviour of dlti::query(op, keys) from just finding the first DLTIQueryInterface-implementing attr of the nearest ancestor of op and only looking up keys on that attribute to continuing on to further ancestors of op in case a lookup doesn't succeed on the first encountered DLTIQueryInterface-implementing attributes.

In effect, this gives query the expected "scoped" look-up semantics in that DLTI attributes that belong to ops whose regions encompass op will now be inspected in case the lookup doesn't succeed in the "closest scope". As usual for scoped lookups we have that names/keys can be shadowed.

Includes a small fix for dlti.transform.query documentation.

Change the behaviour of a DLTI `query(op, keys)` from just finding the
first `DLTIQueryInterface`-implementing attr of the nearest ancestor of
`op` and only looking up `keys` on that attribute to continueing on to
further ancestors of `op` in case a lookup doesn't succeed on the first
encountered `DLTIQueryInterface`-implementing attributes.

In effect, this gives `query` the expected "scoped" look-up semantics
in that DLTI attributes that belong to ops whose regions encompass `op`
will now be inspected in case the lookup doesn't succeed in the "closest
scope". As usual for scoped lookups we have that names/keys can be
shadowed.

Includes a small fix for `dlti.transform.query` documentation.
@llvmbot
Copy link
Member

llvmbot commented Nov 5, 2024

@llvm/pr-subscribers-mlir-dlti

@llvm/pr-subscribers-mlir

Author: Rolf Morel (rolfmorel)

Changes

Change the behaviour of a DLTI query(op, keys) from just finding the first DLTIQueryInterface-implementing attr of the nearest ancestor of op and only looking up keys on that attribute to continueing on to further ancestors of op in case a lookup doesn't succeed on the first encountered DLTIQueryInterface-implementing attributes.

In effect, this gives query the expected "scoped" look-up semantics in that DLTI attributes that belong to ops whose regions encompass op will now be inspected in case the lookup doesn't succeed in the "closest scope". As usual for scoped lookups we have that names/keys can be shadowed.

Includes a small fix for dlti.transform.query documentation.


Full diff: https://github.com/llvm/llvm-project/pull/115043.diff

4 Files Affected:

  • (modified) mlir/docs/Dialects/Transform.md (+4)
  • (modified) mlir/include/mlir/Dialect/DLTI/DLTI.h (+3-2)
  • (modified) mlir/lib/Dialect/DLTI/DLTI.cpp (+58-62)
  • (modified) mlir/test/Dialect/DLTI/query.mlir (+119-10)
diff --git a/mlir/docs/Dialects/Transform.md b/mlir/docs/Dialects/Transform.md
index 37fcc0c8880335..5f79116dd00b55 100644
--- a/mlir/docs/Dialects/Transform.md
+++ b/mlir/docs/Dialects/Transform.md
@@ -427,6 +427,10 @@ ops rather than having the methods directly act on the payload IR.
 
 [include "Dialects/DebugExtensionOps.md"]
 
+## DLTI Transform Operations
+
+[include "Dialects/DLTITransformOps.md"]
+
 ## IRDL (extension) Transform Operations
 
 [include "Dialects/IRDLExtensionOps.md"]
diff --git a/mlir/include/mlir/Dialect/DLTI/DLTI.h b/mlir/include/mlir/Dialect/DLTI/DLTI.h
index f268fea340a6fb..8ff04927052f21 100644
--- a/mlir/include/mlir/Dialect/DLTI/DLTI.h
+++ b/mlir/include/mlir/Dialect/DLTI/DLTI.h
@@ -24,8 +24,9 @@ class DataLayoutEntryAttrStorage;
 } // namespace mlir
 namespace mlir {
 namespace dlti {
-/// Perform a DLTI-query at `op`, recursively querying each key of `keys` on
-/// query interface-implementing attrs, starting from attr obtained from `op`.
+/// Perform a DLTI-query at `op`, by recursively querying each key of `keys` on
+/// `DLTIQueryInterface`-implementing attributes of an op, attempting this query
+/// procedure for all ancestors of `op` in turn, starting with `op` itself.
 FailureOr<Attribute> query(Operation *op, ArrayRef<DataLayoutEntryKey> keys,
                            bool emitError = false);
 } // namespace dlti
diff --git a/mlir/lib/Dialect/DLTI/DLTI.cpp b/mlir/lib/Dialect/DLTI/DLTI.cpp
index 508e50d42e4cf2..ab0ca936099884 100644
--- a/mlir/lib/Dialect/DLTI/DLTI.cpp
+++ b/mlir/lib/Dialect/DLTI/DLTI.cpp
@@ -8,17 +8,13 @@
 
 #include "mlir/Dialect/DLTI/DLTI.h"
 #include "mlir/IR/Builders.h"
-#include "mlir/IR/BuiltinAttributes.h"
 #include "mlir/IR/BuiltinDialect.h"
 #include "mlir/IR/BuiltinOps.h"
-#include "mlir/IR/BuiltinTypes.h"
 #include "mlir/IR/Dialect.h"
 #include "mlir/IR/DialectImplementation.h"
 #include "llvm/ADT/TypeSwitch.h"
 
-#include "llvm/ADT/TypeSwitch.h"
 #include "llvm/Support/Debug.h"
-#include "llvm/Support/MathExtras.h"
 
 using namespace mlir;
 
@@ -489,77 +485,77 @@ void TargetSystemSpecAttr::print(AsmPrinter &printer) const {
 // DLTIDialect
 //===----------------------------------------------------------------------===//
 
-/// Retrieve the first `DLTIQueryInterface`-implementing attribute that is
-/// attached to `op` or such an attr on as close as possible an ancestor. The
-/// op the attribute is attached to is returned as well.
-static std::pair<DLTIQueryInterface, Operation *>
-getClosestQueryable(Operation *op) {
-  DLTIQueryInterface queryable = {};
-
-  // Search op and its ancestors for the first attached DLTIQueryInterface attr.
-  do {
-    for (NamedAttribute attr : op->getAttrs())
-      if ((queryable = dyn_cast<DLTIQueryInterface>(attr.getValue())))
-        break;
-  } while (!queryable && (op = op->getParentOp()));
-
-  return std::pair(queryable, op);
-}
-
 FailureOr<Attribute>
 dlti::query(Operation *op, ArrayRef<DataLayoutEntryKey> keys, bool emitError) {
+  InFlightDiagnostic diag = op->emitError() << "target op of failed DLTI query";
+
   if (keys.empty()) {
-    if (emitError) {
-      auto diag = op->emitError() << "target op of failed DLTI query";
+    if (emitError)
       diag.attachNote(op->getLoc()) << "no keys provided to attempt query with";
-    }
+    else
+      diag.abandon();
     return failure();
   }
 
-  auto [queryable, queryOp] = getClosestQueryable(op);
-  Operation *reportOp = (queryOp ? queryOp : op);
-
-  if (!queryable) {
-    if (emitError) {
-      auto diag = op->emitError() << "target op of failed DLTI query";
-      diag.attachNote(reportOp->getLoc())
-          << "no DLTI-queryable attrs on target op or any of its ancestors";
-    }
-    return failure();
-  }
-
-  Attribute currentAttr = queryable;
-  for (auto &&[idx, key] : llvm::enumerate(keys)) {
-    if (auto map = dyn_cast<DLTIQueryInterface>(currentAttr)) {
-      auto maybeAttr = map.query(key);
-      if (failed(maybeAttr)) {
-        if (emitError) {
-          auto diag = op->emitError() << "target op of failed DLTI query";
-          diag.attachNote(reportOp->getLoc())
-              << "key " << keyToStr(key)
-              << " has no DLTI-mapping per attr: " << map;
+  auto interleaveComma = [](ArrayRef<DataLayoutEntryKey> keys) {
+    std::string buf;
+    llvm::interleave(
+        keys, [&](auto key) { buf += keyToStr(key); }, [&]() { buf += ","; });
+    return buf;
+  };
+
+  // Recursively replace `currentAttr` by the attribute obtained by querying a
+  // new key on each new `currentAttr` until all `keys` have been exhausted -
+  // `atOp` is only used for error reporting.
+  auto queryKeysOnAttribute = [&](Attribute currentAttr,
+                                  Operation *atOp) -> FailureOr<Attribute> {
+    for (auto &&[idx, key] : llvm::enumerate(keys)) {
+      if (auto map = dyn_cast<DLTIQueryInterface>(currentAttr)) {
+        auto maybeAttr = map.query(key);
+        if (failed(maybeAttr)) {
+          if (emitError)
+            diag.attachNote(atOp->getLoc())
+                << "key not present - failed at keys: ["
+                << interleaveComma(keys.take_front(idx + 1)) << "]";
+          return failure();
         }
+        currentAttr = *maybeAttr;
+      } else {
+        // The previous key, if any, is responsible for the current currentAttr.
+        if (idx > 0 && emitError)
+          diag.attachNote(atOp->getLoc())
+              << "attribute at keys [" << interleaveComma(keys.take_front(idx))
+              << "] is not queryable";
         return failure();
       }
-      currentAttr = *maybeAttr;
-    } else {
-      if (emitError) {
-        std::string commaSeparatedKeys;
-        llvm::interleave(
-            keys.take_front(idx), // All prior keys.
-            [&](auto key) { commaSeparatedKeys += keyToStr(key); },
-            [&]() { commaSeparatedKeys += ","; });
-
-        auto diag = op->emitError() << "target op of failed DLTI query";
-        diag.attachNote(reportOp->getLoc())
-            << "got non-DLTI-queryable attribute upon looking up keys ["
-            << commaSeparatedKeys << "] at op";
-      }
-      return failure();
     }
+    return currentAttr;
+  };
+
+  // Run over all ancestors of `op`, starting the recursive attribute query for
+  // each ancestor which has an attribute on which we can perform a query.
+  for (Operation *ancestor = op; ancestor; ancestor = ancestor->getParentOp()) {
+    DLTIQueryInterface queryableAttr;
+    // NB: only the op's first DLTI attr will be inspected
+    for (NamedAttribute attr : ancestor->getAttrs())
+      if (auto queryableAttr = dyn_cast<DLTIQueryInterface>(attr.getValue())) {
+        auto maybeAttr = queryKeysOnAttribute(queryableAttr, ancestor);
+        if (succeeded(maybeAttr)) {
+          diag.abandon();
+          return maybeAttr;
+        }
+      }
+  }
+
+  if (emitError) {
+    if (diag.getUnderlyingDiagnostic()->getNotes().empty())
+      diag.attachNote(op->getLoc())
+          << "no DLTI-queryable attrs on target op or any of its ancestors";
+  } else {
+    diag.abandon();
   }
 
-  return currentAttr;
+  return failure();
 }
 
 constexpr const StringLiteral mlir::DLTIDialect::kDataLayoutAttrName;
diff --git a/mlir/test/Dialect/DLTI/query.mlir b/mlir/test/Dialect/DLTI/query.mlir
index 3825cee6f16164..c5fbf67dc7c2b2 100644
--- a/mlir/test/Dialect/DLTI/query.mlir
+++ b/mlir/test/Dialect/DLTI/query.mlir
@@ -203,16 +203,47 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
+// Demonstation of nested lookup by walking ancestors and co-commitant shadowing.
+
+// expected-remark @below {{associated CPU attr at module 42 : i32}}
+// expected-remark @below {{associated GPU attr at module 43 : i32}}
 module attributes { test.dlti = #dlti.target_system_spec<"CPU" = #dlti.target_device_spec<"test.id" = 42 : i32>,
                                                          "GPU" = #dlti.target_device_spec<"test.id" = 43 : i32>> } {
-  // expected-remark @below {{associated attr 24 : i32}}
+  // expected-remark @below {{associated CPU attr at func 24 : i32}}
+  // expected-remark @below {{associated GPU attr at func 43 : i32}}
   func.func private @f() attributes { test.dlti = #dlti.target_system_spec<"CPU" = #dlti.target_device_spec<"test.id" = 24 : i32>> }
 }
 
 module attributes {transform.with_named_sequence} {
   transform.named_sequence @__transform_main(%arg: !transform.any_op) {
     %func = transform.structured.match ops{["func.func"]} in %arg : (!transform.any_op) -> !transform.any_op
-    %param = transform.dlti.query ["CPU","test.id"] at %func : (!transform.any_op) -> !transform.any_param
+    %module = transform.get_parent_op %func : (!transform.any_op) -> !transform.any_op
+    // First the CPU attributes
+    %cpu_func_param = transform.dlti.query ["CPU","test.id"] at %func : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %cpu_func_param, "associated CPU attr at func" at %func : !transform.any_param, !transform.any_op
+    %cpu_module_param = transform.dlti.query ["CPU","test.id"] at %module : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %cpu_module_param, "associated CPU attr at module" at %module : !transform.any_param, !transform.any_op
+    // Now the GPU attribute
+    %gpu_func_param = transform.dlti.query ["GPU","test.id"] at %func : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %gpu_func_param, "associated GPU attr at func" at %func : !transform.any_param, !transform.any_op
+    %gpu_module_param = transform.dlti.query ["GPU","test.id"] at %func : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %gpu_module_param , "associated GPU attr at module" at %module : !transform.any_param, !transform.any_op
+    transform.yield
+  }
+}
+
+// -----
+
+module attributes { test.dlti = #dlti.target_system_spec<"CPU" = #dlti.target_device_spec<"test.id" = 42 : i32>,
+                                                         "GPU" = #dlti.target_device_spec<"test.id" = 43 : i32>> } {
+  // expected-remark @below {{associated attr 43 : i32}}
+  func.func private @f() attributes { test.dlti = #dlti.target_system_spec<"CPU" = #dlti.target_device_spec<"test.id" = 24 : i32>> }
+}
+
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg: !transform.any_op) {
+    %func = transform.structured.match ops{["func.func"]} in %arg : (!transform.any_op) -> !transform.any_op
+    %param = transform.dlti.query ["GPU","test.id"] at %func : (!transform.any_op) -> !transform.any_param
     transform.debug.emit_param_as_remark %param, "associated attr" at %func : !transform.any_param, !transform.any_op
     transform.yield
   }
@@ -220,6 +251,58 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
+// Demonstation of nested lookup by walking ancestors and co-commitant shadowing.
+
+// expected-remark @below {{associated CPU attr at module 42 : i32}}
+// expected-remark @below {{associated GPU attr at module 43 : i32}}
+module attributes { test.dlti = #dlti.map<"CPU" = #dlti.map<"test.id" = 42 : i32>,
+                                          "GPU" = #dlti.map<"test.id" = 43 : i32>> } {
+  // expected-remark @below {{associated CPU attr at func 42 : i32}}
+  // expected-remark @below {{associated GPU attr at func 43 : i32}}
+  func.func @f(%A: tensor<128x128xf32>) {
+    // expected-remark @below {{associated CPU attr at matmul 24 : i32}}
+    // expected-remark @below {{associated GPU attr at matmul 43 : i32}}
+    %0 = linalg.matmul { test.dlti = #dlti.target_system_spec<"CPU" = #dlti.target_device_spec<"test.id" = 24 : i32>> } ins(%A, %A : tensor<128x128xf32>, tensor<128x128xf32>)
+                        outs(%A : tensor<128x128xf32>) -> tensor<128x128xf32>
+    // expected-remark @below {{associated CPU attr at constant 42 : i32}}
+    // expected-remark @below {{associated GPU attr at constant 43 : i32}}
+    arith.constant 0 : i32
+    return
+  }
+}
+
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg: !transform.any_op) {
+    %constant = transform.structured.match ops{["arith.constant"]} in %arg : (!transform.any_op) -> !transform.any_op
+    %matmul = transform.structured.match ops{["linalg.matmul"]} in %arg : (!transform.any_op) -> !transform.any_op
+    %func = transform.structured.match ops{["func.func"]} in %arg : (!transform.any_op) -> !transform.any_op
+    %module = transform.get_parent_op %func : (!transform.any_op) -> !transform.any_op
+    // First query at the matmul
+    %cpu_matmul_param = transform.dlti.query ["CPU","test.id"] at %matmul : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %cpu_matmul_param, "associated CPU attr at matmul" at %matmul : !transform.any_param, !transform.any_op
+    %gpu_matmul_param = transform.dlti.query ["GPU","test.id"] at %matmul : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %gpu_matmul_param, "associated GPU attr at matmul" at %matmul : !transform.any_param, !transform.any_op
+    // Now query at the constant
+    %cpu_constant_param = transform.dlti.query ["CPU","test.id"] at %constant : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %cpu_constant_param, "associated CPU attr at constant" at %constant : !transform.any_param, !transform.any_op
+    %gpu_constant_param = transform.dlti.query ["GPU","test.id"] at %constant : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %gpu_constant_param, "associated GPU attr at constant" at %constant : !transform.any_param, !transform.any_op
+    // Now query at the func
+    %cpu_func_param = transform.dlti.query ["CPU","test.id"] at %func : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %cpu_func_param, "associated CPU attr at func" at %func : !transform.any_param, !transform.any_op
+    %gpu_func_param = transform.dlti.query ["GPU","test.id"] at %func : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %gpu_func_param, "associated GPU attr at func" at %func : !transform.any_param, !transform.any_op
+    // Now query at the module
+    %cpu_module_param = transform.dlti.query ["CPU","test.id"] at %module : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %cpu_module_param, "associated CPU attr at module" at %module : !transform.any_param, !transform.any_op
+    %gpu_module_param = transform.dlti.query ["GPU","test.id"] at %module : (!transform.any_op) -> !transform.any_param
+    transform.debug.emit_param_as_remark %gpu_module_param, "associated GPU attr at module" at %module : !transform.any_param, !transform.any_op
+    transform.yield
+  }
+}
+
+// -----
+
 module attributes { test.dlti = #dlti.target_system_spec<
   "CPU" = #dlti.target_device_spec<
     "cache::L1::size_in_bytes" = 65536 : i32,
@@ -298,7 +381,7 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
-// expected-note @below {{key "NPU" has no DLTI-mapping per attr: #dlti.target_system_spec}}
+// expected-note @below {{key not present - failed at keys: ["NPU"]}}
 module attributes { test.dlti = #dlti.target_system_spec<
     "CPU" = #dlti.target_device_spec<"test.id" = 42 : i32>,
     "GPU" = #dlti.target_device_spec<"test.id" = 43 : i32>> } {
@@ -317,7 +400,7 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
-// expected-note @below {{key "unspecified" has no DLTI-mapping per attr: #dlti.target_device_spec}}
+// expected-note @below {{key not present - failed at keys: ["CPU","unspecified"]}}
 module attributes { test.dlti = #dlti.target_system_spec<
     "CPU" = #dlti.target_device_spec<"test.id" = 42 : i32>,
     "GPU" = #dlti.target_device_spec<"test.id" = 43 : i32>> } {
@@ -336,7 +419,7 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
-// expected-note @below {{key "test.id" has no DLTI-mapping per attr: #dlti.target_system_spec}}
+// expected-note @below {{key not present - failed at keys: ["test.id"]}}
 module attributes { test.dlti = #dlti.target_system_spec<
   "CPU" = #dlti.target_device_spec<"test.id" = 42 : i32>,
   "GPU" = #dlti.target_device_spec<"test.id" = 43 : i32>> } {
@@ -355,7 +438,7 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
-// expected-note @below {{key "CPU" has no DLTI-mapping per attr: #dlti.dl_spec}}
+// expected-note @below {{key not present - failed at keys: ["CPU"]}}
 module attributes { test.dlti = #dlti.dl_spec<"test.id" = 42 : i32> } {
   // expected-error @below {{target op of failed DLTI query}}
   func.func private @f()
@@ -372,7 +455,7 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
-// expected-note @below {{got non-DLTI-queryable attribute upon looking up keys ["CPU"]}}
+// expected-note @below {{attribute at keys ["CPU"] is not queryable}}
 module attributes { test.dlti = #dlti.dl_spec<"CPU" = 42 : i32> } {
   // expected-error @below {{target op of failed DLTI query}}
   func.func private @f()
@@ -389,7 +472,7 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
-// expected-note @below {{got non-DLTI-queryable attribute upon looking up keys [i32]}}
+// expected-note @below {{attribute at keys [i32] is not queryable}}
 module attributes { test.dlti = #dlti.dl_spec<i32 = 32 : i32> } {
   // expected-error @below {{target op of failed DLTI query}}
   func.func private @f()
@@ -423,7 +506,7 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
-// expected-note @below {{key i64 has no DLTI-mapping per attr: #dlti.map<i32 = 32 : i64>}}
+// expected-note @below {{key not present - failed at keys: ["width_in_bits",i64]}}
 module attributes { test.dlti = #dlti.map<"width_in_bits" = #dlti.map<i32 = 32>>} {
   // expected-error @below {{target op of failed DLTI query}}
   func.func private @f()
@@ -483,4 +566,30 @@ module attributes {transform.with_named_sequence} {
     %param = transform.dlti.query ["test.id"] at %funcs : (!transform.any_op) -> !transform.param<i64>
     transform.yield
   }
-}
\ No newline at end of file
+}
+
+// -----
+
+// expected-note @below {{attribute at keys ["CPU","test"] is not queryable}}
+module attributes { test.dlti = #dlti.map<"CPU" = #dlti.map<"test" = {"id" = 0}>> } {
+  // expected-note @below {{key not present - failed at keys: ["CPU","test","id"]}}
+  func.func @f(%A: tensor<128x128xf32>) attributes { test.dlti = #dlti.map<"CPU" = #dlti.map<"test" = #dlti.map<"ego" = 0>>> } {
+    scf.execute_region { // NB: No notes/errors on this unannotated ancestor
+      // expected-note @below {{key not present - failed at keys: ["CPU","test"]}}
+      // expected-error @below {{target op of failed DLTI query}}
+      %0 = linalg.matmul { test.dlti = #dlti.target_system_spec<"CPU" = #dlti.target_device_spec<"test.id" = 24 : i32>> } ins(%A, %A : tensor<128x128xf32>, tensor<128x128xf32>)
+                          outs(%A : tensor<128x128xf32>) -> tensor<128x128xf32>
+      scf.yield
+    }
+    return
+  }
+}
+
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg: !transform.any_op) {
+    %matmul = transform.structured.match ops{["linalg.matmul"]} in %arg : (!transform.any_op) -> !transform.any_op
+    // expected-error @below {{'transform.dlti.query' op failed to apply}}
+    %cpu_matmul_param = transform.dlti.query ["CPU","test","id"] at %matmul : (!transform.any_op) -> !transform.any_param
+    transform.yield
+  }
+}

@rolfmorel
Copy link
Contributor Author

rolfmorel commented Nov 5, 2024

This PR introduces the most straightforward look-up semantics: just continue traversing outer scopes, i.e visiting ancestor ops, until a multi-key query succeeds in full on a DLTI attribute on one of the ancestor ops of the target op.

This look-up semantics implies that it is not possible to "hide" names that have been defined in DLTI attributes on ancestor ops - it is only possible to redefine them, that is to "shadow" the existing name by associating some other value. Pretty standard in terms of scoped lookup, actually. However, as the following is allowed in syntax we might want to ponder what it should mean:

module attributes { test.dlti = #dlti.map<"CPU" = #dlti.map<"tile_size" = 64, "pad_to" = 16>> } {
  func.func @f(%A: tensor<128x128xf32>) attributes { test.dlti = #dlti.map<"CPU" = {}> } {
    linalg.matmul { test.dlti = #dlti.map<"CPU" = #dlti.map<"tile_size" = 24 : i32>> } ...
    arith.constant 0 : i32
    return
  }
  func.func @g() {
    ...
  }
}

Given that the user overrode/shadowed the "leading" key "CPU" with the non-DLTI attribute {}, what consequence should this have for attempting to query keys ["CPU","pad_to"] at the linalg.matmul inside @f? And what about queries at arith.constant inside @f?

With the PR as is the answer to query(%handle_to_matmul, ["CPU","pad_to"]) is 16 and query(%handle_to_matmul, ["CPU","tile_size"]) is 24 and query(%handle_to_constant, ["CPU","pad_to"]) is 16 as well and query(%handle_to_constant, ["CPU","tile_size"]) is 64.

@rolfmorel rolfmorel changed the title [MLIR][DLTI] Make queries visit all ancestors/respect nested scopes [MLIR][DLTI] Make DLTI queries visit all ancestors/respect nested scopes Nov 6, 2024
Copy link
Contributor

@adam-smnk adam-smnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the overall concept - feels intuitive and has good examples

@rolfmorel
Copy link
Contributor Author

@ftynse, if you could have a look at this PR, that would be much appreciated. Thanks!

@ftynse
Copy link
Member

ftynse commented Nov 19, 2024

I have several points here, take them under advisory, you guys are driving this subsystem at this point anyway.

  1. The example from the second post in this conversation affords a different solution. Shadowing a DLTI attribute with a non-DLTI attribute is invalid and should be checked by the verifier. Not insisting on it, but I wouldn't expect such shadowing in legitimate use cases and rather write it off as incorrect IR construction, at which point it becomes preferable to warn the user.
  2. The initial design considered the case of nested layouts by providing the hook to merge layout specs: . We could defer to specific entries to indicate whether they are "overridable" in certain contexts. I was thinking about, e.g., alignment where nested objects shouldn't just relax alignment requirements. This is in line with reporting an error in the case above. I recall there being some verifier, but not sure if it was completely implemented or required https://discourse.llvm.org/t/connecting-op-verifiers-to-analyses/3553 to proceed.
  3. The intention with that design was to allow specs to be propagated down once and merged when constructing a DataLayoutAnalysis object that would serve for repeated queries. Mutating the IR invalidates the analysis, as is customary. I suppose this may co-exist, if properly documented, with a direct interface usable for one-off queries walking up.
  4. My personal opinion is the caching mechanism in the data layout object was implemented prematurely, but we reached consensus on its theoretical necessity.

My recommendation would be to collect some performance data, if possible, to support or not the need for the cache. Otherwise, we can fallback to using our common knowledge from other similar problems in MLIR or other compilers to inform the decision about caching. I don't have a strong intuition there beyond this infra being under-utilized, where I'd rather go for the implementation that is not embarrassingly bad without prematurely optimizing it.

@rengolin
Copy link
Member

  1. Shadowing a DLTI attribute with a non-DLTI attribute is invalid and should be checked by the verifier.

Agree.

  1. We could defer to specific entries to indicate whether they are "overridable" in certain contexts

Agree, but this is very complex. You could also ask how they're overridable (sub-range, only-increase, only-decrease, replace entirely), and if the overriding is for that node only or all sub-nodes. I'd rather leave that discussion for when we really need it.

  1. Mutating the IR invalidates the analysis, as is customary.

We discussed this a lot downstream. One idea is to have the top-level DLTI dictionary to be invariant ("came from above") and leave the result of any analysis (that could be invalidated) as new attributes to functions, regions, operations. Those could be invalidated by transformations. But again, this brings infrastructure complexity and we should only design when we need it.

  1. My personal opinion is the caching mechanism in the data layout object was implemented prematurely, but we reached consensus on its theoretical necessity.

Good, we reached a similar conclusion. Caching brings all sorts of invalidation problems, especially in view of multi-threaded compilers (isolated-from-above) and central source of information.

My recommendation would be to collect some performance data, if possible, to support or not the need for the cache.

This should be an exercise for another day. Right now, as far as we know, the users of this dialect do not need caching and implementing fully-coherent caching system in MLIR just for DLTI would be overkill.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants